A Parallel Data Storage Interface to GridFTP

Authors

  • Alberto Sánchez
  • María S. Pérez-Hernández
  • Pierre Gueant
  • Jesús Montes
  • Pilar Herrero
Abstract

Most grid projects are characterized by access to huge volumes of data. To support this need, various data services have emerged in the grid world. One of the most successful initiatives in that field is GridFTP, a high-performance transfer protocol based on FTP but optimized for wide area networks. Although GridFTP provides reasonably good performance, GridFTP servers remain a bottleneck for data-intensive applications. One of the most important modules of a GridFTP server is the Data Storage Interface (DSI), which specifies how to read from and write to the storage system, allowing the server to transform the data. With the aim of improving the performance of the GridFTP server, we have designed a new DSI based on MAPFS, a parallel file system. This paper describes this new DSI and its evaluation, showing the advantages of handling data through this optimized GridFTP server.


Similar resources

pNFS and Linux: Working Towards a Heterogeneous Future

Heterogeneous and scalable remote data access is a critical enabling feature of widely distributed collaborations. Parallel file systems feature impressive throughput, but sacrifice heterogeneous access, seamless integration, security, and cross-site performance. Remote data access tools such as NFS and GridFTP provide secure access to parallel file systems, but either lack scalability (NFS) or...


A GridFTP Transport Driver for Globus XIO

GridFTP is a high-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks. Based on the Internet FTP protocol, it defines extensions for high-performance operation and security. The Globus implementation of GridFTP provides a modular and extensible data transfer system architecture suitable for wide area and high-performance environments. GridFTP is the de fa...


GridFTP-APT: Automatic Parallelism Tuning Mechanism for GridFTP in Long-Fat Networks

In this paper, we propose an extension to GridFTP that optimizes its performance by dynamically adjusting the number of parallel TCP connections. GridFTP has been used as a data transfer protocol to effectively transfer a large volume of data in Grid computing. GridFTP supports a feature called parallel data transfer that improves throughput by establishing multiple TCP connections in parallel....


Scheduling On-demand Data Streaming for Cyberinfrastructure Applications with Constraints of Storage and Bandwidth

Cyberinfrastructure is proposed by the US/NSF as the new century’s infrastructure for scientific research and discovery, aiming to provide cyberenvironments to enable scientific applications onto cyberresources, e.g. high performance computers, data archives, software service, distributed communities, telescopes and observatories. This work is focused on cyberinfrastructure applications with da...


Optimising LAN access to grid enabled storage elements

When operational, the Large Hadron Collider experiments at CERN will collect tens of petabytes of physics data per year. The worldwide LHC computing grid (WLCG) will distribute this data to over two hundred Tier-1 and Tier-2 computing centres, enabling particle physicists around the globe to access the data for analysis. Although different middleware solutions exist for effective management of ...




Publication date: 2006